Motivation

The violent crime rate in the U.S. increased by 3.4 percent nationwide in 2016. As international students, as well as New Yorkers, we are always concerned about public safety in NYC, especially after the recent terrorist attack near the World Trade Center. Thus, our group decided to investigate the crime data more deeply and look for underlying factors that may have led to the increase in the crime rate.

Data Description

The official NYPD website provides citywide historic crime data as Excel files. We downloaded these datasets and merged them into nyc_crime_hist. The resulting data frame contains the total number of offenses from 2000 to 2016, the major offense categories (felony, misdemeanor, and violation), and detailed descriptions.

We focus our efforts on the data for the current year, 2017, which we obtained from NYC OpenData. It includes all valid felony, misdemeanor, and violation crimes reported to the NYPD up to October of this year. The dataset was last updated on October 25, 2017.

Data Exploration

Map of crimes

Since the dataset has 341716 observations and 9 variables, we randomly sample 50000 observations to create an interactive map showing the locations where crimes in New York City occurred.
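The sampling step can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the code used for this report; the function name sample_rows, the fixed seed, and the toy records are all assumptions made for the example.

```python
import random

def sample_rows(rows, k, seed=1):
    """Return k rows drawn uniformly at random without replacement.

    If there are k rows or fewer, all rows are returned unchanged.
    """
    if len(rows) <= k:
        return list(rows)
    rng = random.Random(seed)  # fixed seed (an assumption) so the sample is reproducible
    return rng.sample(rows, k)

# Toy stand-in for the complaint records; the coordinates are made up.
rows = [{"id": i, "lat": 40.5 + i * 1e-6, "lon": -74.0} for i in range(341716)]
subset = sample_rows(rows, 50000)
print(len(subset))
```

Sampling without replacement keeps every plotted point a distinct complaint, so dense areas on the map reflect the data rather than duplicated rows.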

In a future Shiny app, we intend to add widgets that limit the data size. The map will show crimes that happened during a specific date range or in a specific borough.

Crime analysis

Analyzing the trend of crimes

The historic data shows the trend of crimes per month from 2000 to 2016. The number of crimes per month is calculated by dividing the total number of crimes in a year by 12 months (by 9 for 2017, since the 2017 data is not yet complete).
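The per-month calculation described above can be sketched in a few lines. This is only an illustration: the function name monthly_average and the yearly totals are hypothetical and are not taken from nyc_crime_hist.

```python
def monthly_average(total_crimes, year, partial_months=None):
    """Average number of crimes per month for one year.

    partial_months maps a year to the number of months it covers;
    years not listed are treated as complete (12 months).
    """
    partial_months = partial_months or {}
    months = partial_months.get(year, 12)
    return total_crimes / months

# Hypothetical yearly totals, for illustration only.
full_year = monthly_average(480000, 2016)
partial_year = monthly_average(342000, 2017, {2017: 9})
print(full_year, partial_year)
```

Dividing the incomplete year by the number of months it actually covers keeps its monthly figure comparable with the complete years.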

We can see that, overall, the number of crimes per month has been decreasing since 2000. However, misdemeanor crimes increased from 2005 to 2010 and dropped again after 2010.

Focusing on the 2017 data, we made a plot showing the total number of crimes by offense type in the first 9 months of 2017. The results indicate that the MISDEMEANOR count is considerably higher than the VIOLATION and FELONY counts.

Crime numbers and crime rate by month, grouped by borough

Here, we take a closer look at the crime numbers and crime rate for each month of this year. To calculate the crime rate, we need population data for NYC, which we obtained from the web. The results show that Brooklyn has the highest number of crimes this year, but the Bronx has the highest crime rate. Queens is relatively safer. We also find that February has fewer crimes, probably because February is usually the coldest month, which makes criminals less willing to be out on the street.
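The distinction between crime numbers and crime rate can be made concrete with a small sketch. We assume here that the rate is the number of reported crimes divided by the borough population; the function name crime_rates, the borough labels "A" and "B", and all counts and populations are made up for illustration.

```python
from collections import Counter

def crime_rates(records, population):
    """Crimes per resident for each (borough, month) pair.

    records: iterable of (borough, month) tuples, one per reported crime.
    population: dict mapping borough -> number of residents.
    """
    counts = Counter(records)
    return {
        (boro, month): n / population[boro]
        for (boro, month), n in counts.items()
    }

# Made-up records and populations, for illustration only.
records = [("A", 1)] * 30 + [("A", 2)] * 20 + [("B", 1)] * 30
population = {"A": 10000, "B": 20000}
rates = crime_rates(records, population)

# Multiply by 1000 to express the rate as crimes per 1000 residents.
print(rates[("A", 1)] * 1000, rates[("B", 1)] * 1000)
```

In this toy example, "A" and "B" report the same number of crimes in month 1, but "B" has twice the population and therefore half the rate. This is the same effect that lets one borough lead in counts while another leads in rate.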

After analyzing the trend by year and by month, we also added a plot of crime count versus hour of day. It clearly shows that crimes usually happen during the daytime. Crimes peak between 15:00 and 20:00, and the numbers are relatively low between 03:00 and 08:00.
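A count of crimes per hour of day can be sketched as below. The time format "%H:%M:%S", the function name crimes_by_hour, and the sample times are assumptions for illustration; the actual column format in the dataset may differ.

```python
from collections import Counter
from datetime import datetime

def crimes_by_hour(times, fmt="%H:%M:%S"):
    """Count reports per hour of day (0-23) from time strings."""
    counts = Counter(datetime.strptime(t, fmt).hour for t in times)
    # Return a full 24-slot list so hours with no reports show up as 0.
    return [counts.get(h, 0) for h in range(24)]

# Made-up report times, for illustration only.
times = ["15:30:00", "15:45:10", "03:05:00", "19:00:00"]
hourly = crimes_by_hour(times)
peak_hour = max(range(24), key=lambda h: hourly[h])
print(peak_hour)
```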

Common crime types in each borough

We built a function to get the most prevalent crimes in each borough and illustrate the results.
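A function of this kind could look like the following sketch. It is not the function used for the figures; the name top_crimes_by_boro, the (borough, offense) record layout, and the sample records are assumptions for illustration.

```python
from collections import Counter, defaultdict

def top_crimes_by_boro(records, n=5):
    """Return the n most frequent offenses for each borough.

    records: iterable of (borough, offense) tuples, one per reported crime.
    Result: dict mapping borough -> list of (offense, count), most frequent first.
    """
    by_boro = defaultdict(Counter)
    for boro, offense in records:
        by_boro[boro][offense] += 1
    return {boro: counter.most_common(n) for boro, counter in by_boro.items()}

# Made-up records, for illustration only.
records = (
    [("BRONX", "ASSAULT")] * 3
    + [("BRONX", "PETIT LARCENY")] * 2
    + [("QUEENS", "PETIT LARCENY")] * 4
    + [("QUEENS", "CRIMINAL MISCHIEF")] * 1
)
top = top_crimes_by_boro(records, n=1)
print(top)
```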

Comments

  • Figure a presents the top five places where crimes usually happen across the five boroughs of NYC. It shows that STREET and RESIDENCE are the most unsafe places, so we look further at the major crime types in these two places for each borough.
  • The width and partitioning of the bars show the crime counts and types. Assault, harassment, and criminal mischief are clearly much more prevalent than other crimes in most boroughs.
  • In residences, harassment and assault are more prevalent, while petit larceny and criminal mischief account for a larger share of crimes on the street.
  • Next, we compare the major crime types across boroughs. Figures b-f show that the distribution of crimes is similar among Manhattan, Brooklyn, Staten Island, and Queens. However, the crime profile of the Bronx appears to be more complex. Specifically, crimes in the Bronx tend to be more serious than in the other four boroughs, with FELONY ASSAULT and DANGEROUS DRUGS, which are classified as felonies, among its major crime types.
  • Overall, the safety level in Manhattan, Staten Island, and Queens is relatively higher.

Additional analysis

Association between crime and income

In addition, we are strongly interested in finding factors that may be associated with the crime rate. Here, we chose household income level. After reading the data from the web, cleaning it, and visualizing it, we were surprised by what the scatter plot shows: both the lower-income and the higher-income boroughs have a very high crime rate.

For example, the median family income in the Bronx is 35176 dollars, associated with a crime rate of 0.029. That is, we expect 29 crime cases per 1000 people. In contrast, boroughs with a median family income between 60000 and 70000 dollars tend to have the lowest crime rate. Taking Queens as an example, we expect only 15 crime cases per 1000 people.

Investigate offense description

We plot the top 10 words in the offense descriptions:

The graph shows the 10 words that appear most often in the offense descriptions. The most frequent one is larceny, which appears nearly 100000 times. Other frequent words include related, petit, assault, harrassment, etc. Most of them indicate the type of crime, which is consistent with what we expected.
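The word count behind such a graph can be sketched as follows. The tokenization rule (lowercase runs of letters), the small stop-word list, the function name top_words, and the sample description strings are assumptions for illustration, not the exact procedure used for the graph.

```python
import re
from collections import Counter

# A tiny stop-word list, assumed for this example only.
STOP_WORDS = frozenset({"of", "and", "the", "to", "in"})

def top_words(descriptions, n=10, stop_words=STOP_WORDS):
    """Return the n most frequent words across all descriptions."""
    words = []
    for text in descriptions:
        # Keep runs of letters only, so digits and punctuation are dropped.
        tokens = re.findall(r"[a-z]+", text.lower())
        words.extend(w for w in tokens if w not in stop_words)
    return Counter(words).most_common(n)

# Illustrative description strings, not real rows from the dataset.
descriptions = [
    "PETIT LARCENY",
    "GRAND LARCENY",
    "ASSAULT 3 & RELATED OFFENSES",
    "PETIT LARCENY",
]
print(top_words(descriptions, n=2))
```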

We compare distinctive words in the offense descriptions of violations and felonies.

The above chart compares distinctive words (that is, words that appear much more frequently in one group than in the other) in the offense descriptions of violations and felonies. We can see that larceny, robbery, burglary, etc. appear more frequently in the offense descriptions of felonies, while harrassment, gambling, and loitering appear more frequently in the offense descriptions of violations. These results give a basic picture of the difference between felonies and violations.
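One common way to find words that are much more frequent in one group than in the other is a smoothed log ratio of relative frequencies. The report does not state which formula was used, so the sketch below is an assumption: the add-one smoothing, the base-2 logarithm, the function name log_ratio, and the sample word lists are all chosen for illustration.

```python
import math
from collections import Counter

def log_ratio(words_a, words_b):
    """Smoothed log2 ratio of word frequencies between two groups.

    Positive values mean a word is relatively more frequent in group A,
    negative values mean it is relatively more frequent in group B.
    """
    count_a, count_b = Counter(words_a), Counter(words_b)
    total_a, total_b = sum(count_a.values()), sum(count_b.values())
    vocab = set(count_a) | set(count_b)
    return {
        # Add 1 to each count so words missing from one group do not divide by zero.
        w: math.log2(
            ((count_a[w] + 1) / (total_a + 1)) / ((count_b[w] + 1) / (total_b + 1))
        )
        for w in vocab
    }

# Made-up word lists, for illustration only.
felony_words = ["larceny", "robbery", "burglary", "larceny"]
violation_words = ["harrassment", "loitering", "harrassment"]
ratios = log_ratio(felony_words, violation_words)
for word, value in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(word, round(value, 2))
```

Sorting by the ratio puts the words most typical of violations at one end and the words most typical of felonies at the other, which is the layout a comparison chart of this kind usually shows.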

Summary

Our analysis focuses on providing information about crimes in NYC.